import pandas as pd
import numpy as np
from lets_plot import *
LetsPlot.setup_html(isolated_frame=True)
Course DS 250
[Tanner Hamblin]
For Project 1, the answer to each question should include a chart and a written response. The year labels on your charts should not include a comma. At least two of your charts must include reference marks.
How does your name at your birth year compare to its use historically?
type your results and analysis here
name object
year int64
AK float64
AL float64
AR float64
AZ float64
CA float64
CO float64
CT float64
DC float64
DE float64
FL float64
GA float64
HI float64
IA float64
ID float64
IL float64
IN float64
KS float64
KY float64
LA float64
MA float64
MD float64
ME float64
MI float64
MN float64
MO float64
MS float64
MT float64
NC float64
ND float64
NE float64
NH float64
NJ float64
NM float64
NV float64
NY float64
OH float64
OK float64
OR float64
PA float64
RI float64
SC float64
SD float64
TN float64
TX float64
UT float64
VA float64
VT float64
WA float64
WI float64
WV float64
WY float64
Total float64
dtype: object
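The chart code that follows plots a frame called `df_Tanner`, but the cell that builds it is not shown. As a minimal sketch, assuming the full dataset is loaded into a DataFrame named `df` (a hypothetical name) with the columns listed above, the per-name frame could be derived with a simple filter. The rows below are made-up stand-in values, not the real counts.

```python
import pandas as pd

# Hypothetical stand-in for the names dataset: one row per name and year.
# Only one state column is included and all values are made up.
df = pd.DataFrame({
    "name": ["Tanner", "Tanner", "Tanner", "Mary", "Mary"],
    "year": [2001, 2002, 2003, 2001, 2002],
    "UT": [30.0, 28.0, 25.0, 10.0, 9.0],
    "Total": [900.0, 850.0, 800.0, 5000.0, 4800.0],
})

# Keep only the rows for one name, ordered by year so the line chart
# connects points in chronological order.
df_Tanner = df.query("name == 'Tanner'").sort_values("year")

# Look up the count in the birth year that the chart marks with a red line.
birth_year_total = df_Tanner.loc[df_Tanner["year"] == 2002, "Total"].iloc[0]
print(birth_year_total)
```

Comparing `birth_year_total` with the rest of the `Total` column is one way to back the written response with a number.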
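(
    ggplot(df_Tanner, aes(x = 'year', y = 'Total')) +
    geom_line() +
    labs(
        x = "Year",
        title = "Tanner is becoming less popular"
    ) +
    # Format year ticks as plain integers so the labels contain no comma
    scale_x_continuous(format = "d") +
    # Reference line at the birth year; numeric to match the int64 year column
    geom_vline(xintercept = 2002, color = "red") +
    theme_bw() +
    geom_text(aes(x=[2001], y=[15], label=["Year I was born"]), size=8, color='red', hjust = 1)
)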
# print("Hello")
If you talked to someone named Brittany on the phone, what is your guess of his or her age? What ages would you not guess?
type your results and analysis here
| | name | year | AK | AL | AR | AZ | CA | CO | CT | DC | ... | TN | TX | UT | VA | VT | WA | WI | WV | WY | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 53205 | Brittany | 1968 | 0.0 | 0.0 | 0.0 | 0.0 | 5.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 5.0 |
| 53206 | Brittany | 1969 | 0.0 | 0.0 | 0.0 | 0.0 | 7.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 5.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 12.0 |
| 53207 | Brittany | 1970 | 0.0 | 0.0 | 0.0 | 0.0 | 5.0 | 5.0 | 0.0 | 0.0 | ... | 0.0 | 7.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 32.0 |
| 53208 | Brittany | 1971 | 0.0 | 0.0 | 0.0 | 5.0 | 17.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 14.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 81.0 |
| 53209 | Brittany | 1972 | 0.0 | 0.0 | 0.0 | 0.0 | 11.0 | 10.0 | 0.0 | 0.0 | ... | 8.0 | 14.0 | 16.0 | 0.0 | 0.0 | 0.0 | 5.0 | 0.0 | 0.0 | 158.0 |
5 rows × 54 columns
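One way to turn this table into an age guess is to find the year in which the name was used most and subtract it from the current year. The sketch below is illustrative only: it uses just the five `year`/`Total` pairs from the preview above, and the helper names `likely_birth_year` and `age_in` are hypothetical. The real answer has to come from the full Brittany frame, which covers many more years.

```python
import pandas as pd

# The five (year, Total) pairs printed in the preview above. The full
# dataset has more years, so the result here is NOT the final answer.
brittany_head = pd.DataFrame({
    "year": [1968, 1969, 1970, 1971, 1972],
    "Total": [5.0, 12.0, 32.0, 81.0, 158.0],
})


def likely_birth_year(frame: pd.DataFrame) -> int:
    """Return the year with the largest Total in the given frame."""
    return int(frame.loc[frame["Total"].idxmax(), "year"])


def age_in(reference_year: int, birth_year: int) -> int:
    """Age reached during reference_year by someone born in birth_year."""
    return reference_year - birth_year


# Peak year within the preview rows only.
peak_in_preview = likely_birth_year(brittany_head)
print(peak_in_preview)
```

Applying the same helper to the full frame gives the peak year to mark on the chart, and years with very small totals indicate the ages one would not guess.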
Mary, Martha, Peter, and Paul are all Christian names. From 1920 - 2000, compare the name usage of each of the four names in a single chart. What trends do you notice?
type your results and analysis here
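A possible starting point for this question is to keep only the four names and the 1920 to 2000 window before plotting. This is a sketch under assumptions: `df` is a hypothetical name for the full dataset, and the rows below are made-up stand-in values chosen to show which rows the filter keeps.

```python
import pandas as pd

names = ["Mary", "Martha", "Peter", "Paul"]

# Hypothetical stand-in rows with made-up values. Some fall outside the
# year window or belong to another name so the filter has work to do.
df = pd.DataFrame({
    "name": ["Mary", "Martha", "Peter", "Paul", "Mary", "Tanner", "Paul"],
    "year": [1919, 1950, 1950, 2000, 1960, 1990, 2001],
    "Total": [100.0, 50.0, 40.0, 30.0, 90.0, 20.0, 25.0],
})

# isin() keeps the four names; between() is inclusive on both ends,
# so 1920 and 2000 themselves are retained.
df_christian = df[df["name"].isin(names) & df["year"].between(1920, 2000)]
print(df_christian)
```

With the frame in this long format, mapping `color` to the `name` column in `aes` should draw one line per name in a single chart.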
Think of a unique name from a famous movie. Plot the usage of that name and see how changes line up with the movie release. Does it look like the movie had an effect on usage?
type your results and analysis here
| | name | year | AK | AL | AR | AZ | CA | CO | CT | DC | ... | TN | TX | UT | VA | VT | WA | WI | WV | WY | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 19325 | Anakin | 1998 | 0.0 | 0.0 | 0.0 | 0.0 | 5.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 5.0 |
| 19326 | Anakin | 1999 | 0.0 | 0.0 | 0.0 | 9.0 | 19.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 15.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 61.0 |
| 19327 | Anakin | 2000 | 0.0 | 0.0 | 0.0 | 0.0 | 8.0 | 6.0 | 0.0 | 0.0 | ... | 0.0 | 12.0 | 0.0 | 0.0 | 0.0 | 5.0 | 0.0 | 0.0 | 0.0 | 44.0 |
| 19328 | Anakin | 2001 | 0.0 | 0.0 | 0.0 | 0.0 | 8.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 7.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 15.0 |
| 19329 | Anakin | 2002 | 0.0 | 0.0 | 0.0 | 0.0 | 9.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 8.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 17.0 |
5 rows × 54 columns
(
    ggplot(df_Anakin, aes(x = "year", y = "Total")) +
    geom_line() +
    labs(title = "Anakin is getting more popular", x = "Years") +
    # Format year ticks as plain integers so the labels contain no comma
    scale_x_continuous(format = "d") +
    # One reference line per film release year; numeric to match the year column
    geom_vline(xintercept = 1999, color = "darkgreen") +
    geom_vline(xintercept = 2002, color = "blue") +
    geom_vline(xintercept = 2005, color = "red") +
    geom_text(aes(x=[1999], y=[65], label=["Episode I"]), size=8, color='darkgreen', hjust = 0) +
    geom_text(aes(x=[2002], y=[25], label=["Episode II"]), size=8, color='blue', hjust = 0) +
    geom_text(aes(x=[2005], y=[100], label=["Episode III"]), size=8, color='red', hjust = 0)
)
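To support the chart with numbers, the year-over-year change in `Total` can be compared against the release years marked by the vertical lines. The sketch below uses only the five `year`/`Total` pairs from the preview table (1998 to 2002), so it covers the first two marked years and not 2005. The variable names are hypothetical.

```python
import pandas as pd

# The five (year, Total) pairs from the preview table above.
anakin_head = pd.DataFrame({
    "year": [1998, 1999, 2000, 2001, 2002],
    "Total": [5.0, 61.0, 44.0, 15.0, 17.0],
})

# Year-over-year change in usage; the first year has no predecessor,
# so its change is NaN.
change = anakin_head.set_index("year")["Total"].diff()

# Release years marked on the chart that fall inside the preview window.
release_years = [1999, 2002]
print(change.loc[release_years])
```

Running the same `diff()` on the full `df_Anakin` frame would extend the comparison to 2005 and later years.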